CD4 glycoprotein on the surface of T cells helps
in the immune response and is the receptor for
HIV infection. The structure of a soluble fragment
of CD4 determined at 2.3 Å resolution reveals that
the molecule has two intimately associated
immunoglobulin-like domains. Residues implicated
in HIV recognition by analysis of mutants and antibody
binding are salient features in domain D1.
Domain D2 is distinguished by a variation on the
β-strand topologies of antibody domains and by
an intra-sheet disulphide bridge.



CD4, a cell-surface glycoprotein found primarily on T lymphocytes,
is required to shape the T-cell repertoire during thymic
development and to permit appropriate activation of mature T
cells (ref. 1). T cells that recognize antigens associated with class II
major histocompatibility complex (MHC) molecules, mainly T
helper cells, express CD4. Evidence is accumulating that CD4
and the T-cell receptor coordinately engage class II molecules
on antigen-presenting cells to mediate an efficient cellular
immune response, and that engaged CD4 may transmit a signal
to an associated cytoplasmic tyrosine kinase, p56lck.

CD4 belongs to the immunoglobulin superfamily of molecules
which generally serve in recognition processes (refs 2, 3). The sequence
of CD4 (refs 4, 5) indicates that it consists of a large (~370 residues)
extracellular segment composed of four tandem immunoglobulin-like
domains, a single transmembrane span, and a short
(38 residues) C-terminal cytoplasmic tail. The first domain (D1)
shares several features with immunoglobulin variable domains,
but the sequence similarities between immunoglobulins and
the other extracellular domains (D2, D3 and D4) are more
remote.

In humans, CD4 can be subverted from its normal immunosupportive
role to become the receptor for infection by the
human immunodeficiency virus (HIV) (refs 1, 6, 7). Recombinant soluble
CD4 proteins bind to the HIV envelope glycoprotein gp120,
and can thus inhibit viral infection and virus-mediated cell
fusion in vitro (refs 8, 9 and references therein). Domain D1
suffices for high-affinity binding to gp120 (ref. 8), and the analysis
of substitution mutants further limits the sites of interaction
to discrete regions in the domain (refs 8, 10-13).

Crystals of whole soluble CD4 (sCD4) molecules have been 
grown 1 *' 30 but there has been limited success in achieving 
adequate diffraction order. The high solvent content and weak 
diffraction of several characterized polymorphs of human sCD4 
are compatible with an extended, flexible molecule 1 . From the
pattern of proteolytic cleavages that generate stable fragments 
(refs 21-23 and unpublished results), the main flexibility seems 
to be at the D2 to D3 junction. We have now crystallized a 
truncated derivative of CD4 that diffracts well, and here we 
report its atomic structure. This recombinant fragment as 
secreted from Chinese hamster ovary (CHO) cells consists of
residues 1-183 of human CD4 plus two missense residues,
Asp-Thr; and it is unglycosylated. This molecule, which we refer
to as D1D2, is as active as sCD4 in binding to gp120 (dissociation
constant Kd = 3 nM) and retains all antibody epitopes mapped
to these domains of CD4 (ref. 8 and unpublished results). Others
have crystallized similar fragments from the N-terminal half of
sCD4 2 ° 5 and the structure of one is reported in the accompanying
paper (ref. 25).

Here we describe the D1D2 structure in comparison with that
of immunoglobulin domains, provide a geometrical definition
for HIV recognition sites, and discuss implications of the
structure for normal CD4 function and evolution of the
immunoglobulin family. We find that the domains of CD4 are
indeed immunoglobulin-like, although there are significant
differences from the antibody analogues. The primary sites for
HIV interaction are on loops that protrude from the variable-like
D1 domain in analogy with immunoglobulin complementarity-determining
regions (CDRs). The D2 domain, which is intimately
associated with D1, resembles constant domains but it
is distinguished by a strand topology that is variable-like.

Structure determination 

D1D2 protein secreted from CHO cells was purified to
homogeneity by following procedures similar to those used for
sCD4 (ref. 26). Crystals of this protein grown from polyethylene
glycol (PEG) and stabilized at pH 8.2 (Table 1) belong to space
group C2 and have unit cell dimensions of a = 83.71 Å, b =
30.07 Å, c = 87.54 Å, β = 117.3°. They contain one D1D2
molecule per asymmetric unit and 50% solvent. In searching for
derivatives, we soaked crystals in stabilization medium doped
with various heavy-atom compounds.
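
The quoted solvent content can be checked approximately from the cell constants. The following is a minimal sketch, not the authors' calculation: it assumes the usual four asymmetric units per C2 cell, a rough molecular mass of 110 Da per residue for the 185-residue construct, and the common Matthews approximation (solvent fraction ≈ 1 - 1.23/V_M). The function names are illustrative only.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (cubic angstroms) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def solvent_fraction(volume, n_asym_units, mass_da):
    """Matthews estimate: V_M = V / (Z * M); solvent fraction ~ 1 - 1.23 / V_M."""
    v_m = volume / (n_asym_units * mass_da)
    return v_m, 1.0 - 1.23 / v_m


# Cell constants as quoted in the text.
volume = monoclinic_cell_volume(83.71, 30.07, 87.54, 117.3)

# Assumptions (not from the text): Z = 4 asymmetric units for space group C2,
# and an approximate mass of 110 Da per residue for 185 residues.
Z_C2 = 4
ASSUMED_MASS_DA = 185 * 110.0

v_m, f_solvent = solvent_fraction(volume, Z_C2, ASSUMED_MASS_DA)
print(f"V = {volume:.0f} A^3, V_M = {v_m:.2f} A^3/Da, solvent = {f_solvent:.0%}")
```

Under these assumptions the estimate falls close to the 50% solvent content stated above; a different assumed mass would shift the figure by a few per cent.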

Diffraction data for the structure determination (Table 1) were
measured at an area detector facility (ref. 27) where we collected CuKα
data for the native protein and for candidate derivatives. Observable
intensities could be recorded to 1.9 Å spacings from fresh
native crystals, and even more strongly from the K2Pt(NO2)4
derivative. We could also measure multiwavelength anomalous
diffraction (MAD) data from this platinum derivative using
characteristic gold emission lines which bracket the Pt LIII
absorption edge (ref. 28). Phases were evaluated by both multiple
isomorphous replacement (MIR) and MAD methods (Table 1).
Electron density maps from MIR phasing (including anomalous
data) at 2.7 Å resolution and from MAD phasing at 3.5 Å
resolution both showed similar features. We then probabilistically
combined MIR and MAD phase information to produce
a combined map at 2.7 Å resolution (Fig. 1a) which showed
distinct solvent channels and an apparent two-domain structure.
A complete chain-tracing consistent with the amino-acid
sequence for the first domain was possible, but the second
domain remained difficult to interpret. By using a hand-drawn
molecular envelope, we then performed solvent flattening and
density truncation to improve the map further (Fig. 1b). This
enabled us to trace the chain through the second domain. The
C-terminal 12 residues (174-185) could not be seen.
A complete model was built into the solvent-flattened map
as displayed by the FRODO (ref. 29) program on a graphics workstation.
Fragments from well-refined structures were fitted onto Cα
guide points measured from a minimap and used as the starting
point for building. Refinement made use of the PROLSQ (ref. 30) and
XPLOR (ref. 31) programs and included several manual rebuildings.
Resolution of the analysis was gradually extended to 2.3 Å.
After analysis of model 5 (Table 1) at R = 0.208, we discovered
on comparing manuscripts that this model and one developed
by Wang et al. (ref. 25) differed in the alignments of sequence in two
strands (E of D1 and B of D2) and in conformations at residues
103-108, 134-139 and 151-154. We then tested a model with
residues 66-73 and 112-120 positioned in the former places of
64-71 and 114-122, respectively, and with the 134-139 region
also rebuilt. Further refinement produced model 6 (Table 1)
with an R value of 0.196 and a stereochemical ideality typified
by an r.m.s. deviation of 0.018 Å from ideal bond lengths. Study
of this still incompletely refined model confirms the revised



TABLE 1 Statistics from the crystallographic analysis
(c) MAD analysis
(d) Solvent flattening
(e) Refinement
Crystallization. D1D2 was purified in sequential steps of ion-exchange chromatography 1 * involving S-Sepharose capture at pH 5.2, Q-Sepharose passage at pH 9.0 and a final elution from S-Sepharose at pH 6.0 in 0.6 M NaCl. Crystallization was carried out at 20 °C by equilibration against reservoirs of 20-25% PEG 3350 starting from hanging droplets composed of protein at ~20 mg ml^-1 in PBS at pH 7.0 and equal volumes from the reservoir. Collected crystals were transferred to a stabilization medium (30% PEG 3350, 80 mM Tris buffer, pH 8.2). Typical crystals have dimensions of 0.5 × 0.3 × 0.2 mm. Diffraction data for MIR phasing and for refinement were measured with CuKα X-rays generated by a Rigaku RU200 rotating anode and detected on two multiwire chambers on the Mark III system at UCSD. All data were collected at or near 20 °C. The inverse beam method was used to collect Bijvoet mates; that is, settings were changed from (ω, χ, φ) to (ω, -χ, φ + π) after rotations at each crystal orientation. Data for MAD phasing were collected on the Mark II area detector system. A graphite monochromator was used to select specific characteristic lines produced from a gold-plated anode installed on an Elliot GX-6 generator. The monochromator was tuned to 50% intensity at one side or the other of the unresolved L-line doublet for selection of a primary monochromatic component. The UCSD area detector processing package was used for the reduction of all data. Values given inside parentheses for numbers of reflections and Rsym are without merging of Friedel mates (Rsym = Σ|I - ⟨I⟩|/ΣI).
MIR analysis. Heavy-atom positions were determined from Patterson syntheses. The refinement of heavy-atom parameters and the phase calculations were performed with a version of program PHARE modified by Z. Otwinowski (personal communication) for an improved error treatment. Relative occupancies for the K2Pt(NO2)4 and K2PtCl6 derivatives were (1.00, 0.94) and (0.53, 0.51, 0.25, 0.22, 0.19), respectively. The first two heavy-atom sites of the K2PtCl6 derivative are in common with the sites of the K2Pt(NO2)4 derivative. Parameters cited in the table are the calculated heavy-atom structure-factor contributions; the calculated Bijvoet difference; the r.m.s. isomorphous closure error; the r.m.s. anomalous closure error; and m, the mean figure-of-merit.
MAD analysis. The MADSYS system of programs* 4 was used to produce initial MAD phases. Phase combination between MIR and MAD was performed by the combination of ABCD coefficients 1 *. But in this case it was necessary to modify the original MAD refinement formulation so as to produce phases for the native non-anomalously scattering component of the platinyl CD4 complex. The previously used method for extracting ABCD coefficients (ref. 50) proved to be ineffective in this case; instead, these phasing coefficients were extracted directly from MAD phases and figures-of-merit. MIR and MAD phases were combined at 20.0-3.5 Å; MIR phases were used alone for the 3.5-2.7 Å data. Observed anomalous diffraction difference ratios are given for the data between 20.0-3.5 Å. Bijvoet (diagonal elements of matrix) and dispersive (off-diagonal elements) difference ratios represent r.m.s.(ΔF±h)/r.m.s.(|F|) and r.m.s.(ΔFΔλ)/r.m.s.(|F|), respectively.
Solvent flattening. Solvent flattening was performed following procedures used in the structure analysis of myohaemerythrin 311 . After interpretation of the first domain, a molecular envelope was delineated which included 58% of the asymmetric unit of the crystal (as compared with 50% for the true solvent content) to ensure that all protein density was included. After subtraction of the mean solvent density and truncation of the lowest density values within the molecular boundary, the contents of the modified density function were Fourier inverted. Phasing coefficients from the inversion were reduced by 0.5 before combination with ABCD values from MIR. A truncation level to eliminate the lowest 20% of molecular points (as compared with 0% and 10%) was chosen as giving the best definition for known features in domain D1. The phase refinement was carried out starting from the 20.0-4.5 Å shell, and including higher angle reflections successively until finally reaching 2.7 Å spacings. Five cycles of solvent flattening were carried out for each shell between shell 1 and shell 12, and 10 cycles were run for shell 13. R, crystallographic R value between observed structure factor magnitudes and calculated values from the solvent-flattened map. Δφ, phase differences between phases from MIR plus MAD and phases from the solvent-flattened map.
Refinement. Selected points along the course of refinement are listed. For all steps, the data having Fobs > 4σ(Fobs) were included. Step 0 refers to the unrefined starting model after adjustment of the scale factor only. From step 2 onward, the data between 10.0 and 6.0 Å were included, which helped to achieve better connectivities in 2Fobs - Fcalc maps. The final model at 2.3 Å resolution is based on the 5,681 reflections (64%) which met the 4σ criterion. This partially refined model includes the 1,420 atoms from residues 1-173 plus the 72 water sites. Thermal parameters were restrained as typified by an r.m.s. discrepancy of 1.1? Å² in B values between bonded main-chain atoms. Bond deviation from ideality is given in Å.
FIG. 1 Electron-density distributions used in the structural determination.
The portion displayed in each panel includes the segment Phe 26-His 27-Trp 28-Lys 29
with the partially refined model 5 (yellow) superimposed on
the density (blue). a, Experimental map at 2.7 Å resolution based on combined
MIR plus MAD phase information; m = 0.55. b, Experimental map at 2.7 Å
resolution after phase refinement by solvent flattening and density truncation;
m = 0.86. c, Refined 2|Fobs| - |Fcalc| map with model phases after
refinement at 2.3 Å resolution; R = 0.208.



alignment in D1 and supports the changes made in D2, although
several loops in D2 remain ill-defined. All conclusions drawn
from model 5 remain unchanged, but the precise numerical
quantities and Figs 2b, 4 and 5a were redone from model 6.
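
For readers unfamiliar with the R values quoted for models 5 and 6, the conventional crystallographic residual can be sketched as follows. This is an illustrative sketch only: the amplitudes are hypothetical placeholders, the two lists are assumed to be on a common scale already, and the function name is not taken from PROLSQ or XPLOR.

```python
def r_value(f_obs, f_calc):
    """Conventional residual R = sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical structure-factor amplitudes, for illustration only.
f_obs = [100.0, 50.0, 25.0, 10.0]
f_calc = [90.0, 55.0, 20.0, 12.0]

r = r_value(f_obs, f_calc)
print(f"R = {r:.3f}")
```

A lower R indicates closer agreement between the model and the measured amplitudes, which is the sense in which model 6 (R = 0.196) improves on model 5 (R = 0.208).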



Overall structure 

The D1D2 structure is of the all-β type. It consists of two
intimately associated domains, each of which is a β sandwich
folded with immunoglobulin-strand topology. The polypeptide
folding is shown in Fig. 2. The first domain (D1) comprises
residues 1-98 disposed in nine β strands, and the second domain
(D2) contains residues 99-173 and has seven β strands. These
domain boundaries are in striking correspondence with intron
boundaries immediately after residues 100 and 177 in the gene
structure (ref. 3). The suggestion that J-like regions follow these introns
is not borne out by the structure.

The last strand of D1 continues straight into the first strand
of D2, running for a length of 49 Å from residues 88-103. This
tandem association of domains leads to a rod-shaped molecule
of dimensions roughly 25 × 25 × 60 Å. By comparing the solvent-accessible
surface area of D1D2 with that of the separated


FIG. 2 Backbone structure of the D1D2 fragment of CD4. a, Schematic
diagram (copyright Yarmolinski and Hendrickson, 1990). b, Stereodiagram
of the α-carbon backbone. Positions of residue numbers divisible by ten
are indicated by small spheres; sulphur atoms in disulphide bridges are
indicated by larger spheres. The point of view is down the Z axis after
rotations about X (100°), Z' (-30°), Y' (-90°) and X' (10°) starting with the
frame having X, Y, Z along a, b and c*.
domains, we found that 310 Å² from each domain is buried in
the interface. This compares with 110 Å² for the VH-CH interface
of Fab New (ref. 32). Interdomain contacts are largely hydrophobic.
Residues involved in interdomain contacts include Val 3,
Leu 5, Ile 76, Leu 96, Val 97 and Phe 98 from D1, and Glu 119,
Pro 121, a Gln residue, Val 168 and Phe 170 from D2. The surface of
D1D2 has many positively charged amino acids (the calculated
pI of residues 1-183 is 10.0), and the electrostatic potential
surface (ref. 33) computed from the model is mostly positive at neutral
pH. There are, however, two prominent patches of negative
potential: one is associated with Glu 85, Glu 87 and Asp 88 and
the other is from Asp 105, Asp 153 and Asp 173.
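
The buried-surface comparison described above amounts to simple bookkeeping on three solvent-accessible areas. The sketch below shows that arithmetic under the assumption that the buried area is shared equally between the two domains; the three input areas are hypothetical placeholders chosen only to illustrate the calculation, not values from the structure.

```python
def buried_area_per_domain(area_d1_alone, area_d2_alone, area_complex):
    """Half of the accessible surface lost when the two domains associate.

    Total buried area = A(D1 alone) + A(D2 alone) - A(D1D2); dividing by two
    assumes each domain contributes equally to the interface.
    """
    total_buried = area_d1_alone + area_d2_alone - area_complex
    return total_buried / 2.0


# Hypothetical accessible areas in square angstroms (placeholders only).
AREA_D1_ALONE = 5000.0
AREA_D2_ALONE = 4500.0
AREA_D1D2 = 8880.0

per_domain = buried_area_per_domain(AREA_D1_ALONE, AREA_D2_ALONE, AREA_D1D2)
print(f"buried per domain: {per_domain:.0f} A^2")
```

In practice the three areas would come from a surface-accessibility program run on the intact fragment and on each domain in isolation.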



TABLE 2 Residues implicated in HIV recognition by point mutation in D1 of CD4*
* Within the 98 amino-acid residues of the D1 domain, there is no mutational
information on residues 3-6, 13, 14, 16, 18, 26, 69-71, 76, 83, 84 and 97. However,
the binding of gp120 to rhesus CD4 and to the human-rat chimaeric protein
eliminates 12, 14, 18 and 76 from this list. Mutations which caused global alteration
of structure, as evidenced by reduced binding of several anti-CD4 monoclonal antibodies,
were not considered 1 ** 1 *. Such mutations were the only data for 13 amino acids and
thus led to exclusion of residues 7, 13, 28, 36, 37, 94, 67, 62, 7a, 79, 82, S3 and 95.
Thus, in total, 25 amino acids are excluded from consideration.

‡ Fractional solvent accessibility was calculated as the ratio of solvent-accessible
surface area for atoms of an amino-acid residue X in the protein to that area obtained
after reducing the structure to a Gly-X-Gly tripeptide. Values cited are for side-chain
atoms except in the case of glycine residues, where main-chain values are given.

† Mutants are identified by single-letter code. Mutational effects are symbolized by
blackening in steps proportional to the quantitational sensitivity of the assays as
reported. Only single amino-acid substitutions are included. The results from double
substitutions largely confirm these data. Certain double mutations are of particular
note. A, G41S/F43C severely disrupts binding (ref. 10), emphasising the significance of
these residues. B, L51M/A55S does not affect binding, indicating that the severe effect
of the A55F substitution is conformational in nature. C, E87G/K90G and D88N/Q89fc do not
affect binding, further indicating that this region is not involved in initial binding to gp120.

§ Relative effects of the indicated mutations on virus or gp120 binding to cell-surface
CD4 or gp120 binding to soluble CD4 proteins** 4 - 10 . The results shown are
an interpretation of the primary data, and comparisons among the results from the
different laboratories are approximate.

|| Relative ability of cells bearing CD4 mutant proteins to form syncytium with cells
expressing viral envelope proteins (refs 10, 17) or of soluble CD4 proteins to inhibit syncytium.

¶ The Q40A mutation increased binding affinity to gp120 twofold.
The D1D2 crystal lattice contains diad axes that could
accommodate a dimeric molecule. However, contacts across
screw axes dominate the lattice interactions and the only diad-mediated
interface (between D1 domains) seems unlikely to
persist in solution. The lattice interactions of D1 are more
extensive than those of D2, which is reflected in molecular
mobilities of the domains: the average B value is 19.9 Å² for
D1 compared with 37.8 Å² for D2. This accounts for our greater
difficulty in tracing the D2 chain.

CD4, like many members of the immunoglobulin superfamily,
is a single-chain cell-surface receptor composed of tandemly
repeated domains. Some of these domains have been assigned
as V-like and others as C-like or in the 'C2 set' (ref. 34). The D1D2
fragment of CD4 contains both types and, apart from the similarity
of CD4 with the PapD bacterial chaperone protein (ref. 35), this is
the first three-dimensional structure for such members of the
family.

Variable-like domain D1

As anticipated by sequence comparisons, D1 is similar to the
variable (V) domains of antibodies. A schematic drawing of the
folding pattern with standard immunoglobulin designations for
the secondary structural elements is shown in Fig. 3. Strands B,
D and E make up one β sheet, and strands A, C, C', C'', F and
G are in the other sheet of the β sandwich. In immunoglobulin
V domains, the first half of strand A is hydrogen-bonded to
strand B of the outer sheet but the second half switches to the
inner sheet. Strand A of D1 is foreshortened and occupies only
the 'inner-sheet' position, hydrogen-bonded in parallel with
strand G. D1 preserves the inter-sheet disulphide bridge and
several other elements of the hydrophobic core characteristic of
the immunoglobulin framework. Starting from these alignments
we have superimposed D1 onto representative immunoglobulin
V domains from Bence-Jones protein Rei (VLκ) (ref. 36) and Fab
New (VLλ and VH) (ref. 32). The structurally aligned sequences are
shown in Fig. 4. D1 superimposes remarkably well onto the
β-sheet framework of all V domains, with a best match to Rei
bringing 72 Cα atoms to an r.m.s. discrepancy of 1.22 Å from
exact superposition.
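
The match statistics quoted for these superpositions (a number of matched Cα positions and their r.m.s. discrepancy) can be illustrated with a short sketch. It assumes the two coordinate sets have already been superimposed and paired position-by-position from the structural alignment, and it applies the 2.5 Å match criterion used for Fig. 4; it does not perform the optimal superposition itself. The coordinates shown are hypothetical.

```python
import math


def match_statistics(coords_a, coords_b, cutoff=2.5):
    """Return (N, rms) over paired positions separated by at most `cutoff`.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples that are
    assumed to be superimposed already and paired by structural alignment.
    """
    squared = []
    for p, q in zip(coords_a, coords_b):
        d2 = sum((x - y) ** 2 for x, y in zip(p, q))
        if d2 <= cutoff ** 2:
            squared.append(d2)
    if not squared:
        return 0, 0.0
    return len(squared), math.sqrt(sum(squared) / len(squared))


# Hypothetical paired C-alpha coordinates (angstroms), for illustration only.
domain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
domain_b = [(1.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 3.0)]

n_matched, rms = match_statistics(domain_a, domain_b)
print(n_matched, rms)
```

In a full treatment the superposition and the selection of matched positions would be iterated until the matched set stops changing.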

In striking contrast with the conserved core of β strands,
several loops between strands in CD4 are quite different from
those in immunoglobulin V domains. In particular, loop CC' is
shortened by four residues and loop FG (immunoglobulin
CDR3) is shortened by four to six residues from canonical
immunoglobulin lengths. These loops mediate VH-VL dimerization
in immunoglobulins, an interaction that does not occur
with CD4. One addition, important for gp120 binding, is the
lengthening of loop C'C'' (immunoglobulin CDR2) by three
residues in CD4 compared with κ light chains such as Rei. In
this regard, CD4 more closely resembles VH domains even
though overall it superposes best with Rei. This CDR2-like loop
interacts with Trp 62, not found in antibodies, and juts out at
the tip. Few of the loop changes in D1 were detected in previous
alignments with Rei, and consequently structural predictions
based on Rei (ref. 37) have failed to capture the essence of CD4. The
salience of the CDR2 analogue of CD4 and the diminutive
nature of the CC' loop and the CDR3 analogue are evident in
Figs 2 and 3.

Distinctive domain D2 

Domain D2 of CD4 has a tertiary folding that can readily be
recognized as resembling immunoglobulin constant domains.
Again following standard immunoglobulin nomenclature (Fig.
3), one β sheet of the sandwich-like structure contains strands
A, B and E, and the other sheet has strands C, C', F and G.
Despite the general similarity of D2 to constant domains, details
of the folding are considerably divergent. First, the size of D2
is small (75 residues) compared with that of immunoglobulin
constant domains (~100 residues). This manifests itself in
shorter strand lengths. A second variation is that strand C',
FIG. 3 Schematic comparisons between domains of CD4 and other
immunoglobulin-folded domains. a, Ribbon diagrams comparing D1 with a
variable λ light-chain domain and a variable heavy-chain domain. The point
of view is as in Fig. 2; the CDR loops of immunoglobulins and their analogues
in D1 are indicated by numerals. b, Ribbon-diagram comparisons of D2 with
a constant domain and with domain D2 of the bacterial chaperone protein
PapD. c, Topology diagrams and strand nomenclature for the β sheets in
CD4 and in variable and constant domains of immunoglobulins.



which corresponds sequentially to the D strand of normal constant
domains, joins the β sheet consisting of strands C, F and
G, rather than being hydrogen-bonded to strand E in the other
sheet. In this respect, D2 is V-like. A final striking difference
from immunoglobulins is that D2 has its disulphide bond
between strands in the same sheet, that is strands C and F (Fig.
2), rather than between sheets as is usual. This feature, which
is unusual but not unprecedented (ref. 38), had been predicted;
however, strand C' was not anticipated. The hydrophobic core
of immunoglobulin domains is preserved in D2 despite the
unusual disulphide connection (Fig. 4). Leu 116 on strand B
occupies the position of a normal cysteine partner for Cys 159
on strand F. The inward orientation of the intra-sheet disulphide
bridge makes it an important member of the hydrophobic core
even though it does not join the sheets.

D2 can be structurally aligned with the CH1 domain of Fab
New (Fig. 4), but the resulting spatial superposition is not as
close as for D1 comparisons with V domains. Only six of the
seven strands superimpose at a 2.5 Å stringency for matches,
and this gives 33 matches with an r.m.s. discrepancy of 1.59 Å.
D2 superimposes somewhat less well with the Rei variable
domain (30 matches within 2.5 Å for 1.68 Å r.m.s. discrepancy).
In fact, D2 is actually more similar to the domains of PapD (ref. 35)
than to immunoglobulin domains. As in the D2 topology, PapD
domains also exhibit sheet switching (partially in D1(PapD),
fully in D2(PapD)), and 39 Cα atoms of D2(CD4) superimpose
on D1(PapD) with an r.m.s. discrepancy of 1.55 Å. Drawings
of the aligned structures are shown in Fig. 3.

Binding site for gp120

The initial molecular event in HIV infection is the binding of
gp120 on the viral envelope to CD4 on the cell surface. The
high affinity of this interaction, at least for several laboratory
strains of the HIV-1 virus (ref. 39), has permitted a detailed mapping
of the binding site. More than 200 mutant CD4 recombinants
have been constructed and tested for gp120 affinity (refs 8,
10-18, J. Arthos et al., manuscript submitted, and unpublished
results). Some of these mutant proteins have also been used to
map the CD4 epitopes of a battery of monoclonal antibodies.
In Table 2 we distil the mutation results to those point mutants
in D1 which impair gp120 binding without causing extensive
disruptions of structure as judged by antibody binding. Analyses
have been reported for mutations at 80 of the 98 residues in
D1, but only 19 positions among them have impact on gp120
binding without apparent global conformational change. Thirteen
of these are in the span from 38 to 59. Completely buried



FIG. 4 Structural alignment of the amino-acid sequences of
CD4 domains with other immunoglobulin-related domains.
Shaded residues have Cα positions within 2.5 Å of corresponding
CD4 positions after optimal superposition of all
shaded residues for a given pair of domains. (Exceptions up
to 2.60 Å were allowed for residues in the middle of strands.)
Each superposition relates a certain number of positions, N,
within the specified 2.5 Å limit, and these match at a certain
r.m.s. discrepancy, Δ. For the match of D1 with Rei (VLκ),
N = 72 and Δ = 1.22 Å; for D1 versus New (VLλ), N = 66 and
Δ = 1.12 Å; and for D1 versus New (VH), N = 68 and Δ =
1.13 Å. For the match of D2 with New (CH1), N = 33 and
Δ = 1.59 Å; for D2 versus PapD(D2), N = 32 and Δ = 1.26 Å;
for D2 versus PapD(D1), N = 39 and Δ = 1.55 Å; and for D2
versus Rei (VLκ), N = 30 and Δ = 1.68 Å. The alignments of
D2 versus PapD(D1) and D2 versus Rei (VLκ) are not shown.
residues can be discounted as direct participants in binding
because their exposure would require major conformational
change, which is without precedent for immunoglobulin
domains. We have calculated the degree of exposure for each
residue in D1 (Table 2), and find that three of the gp120-sensitive
residues, Gly 38, Ala 55 and Phe 67, are inaccessible. It seems
unlikely that either Lys 29 or Gln 64 is critical for binding
because there is no effect for mutations at the more exposed
neighbours of Lys 29 or for single or multiple mutations at 64,
except alanine. Thus only the alanine replacements at 77, 81
and 85 fall outside of the 41-59 span. The distribution of binding
mutants is shown in Fig. 5a, b.

The importance of the 41-59 region in HIV binding is corroborated
by a chimaeric CD4 in which the human sequence from
36 to 62 has been inserted into rat CD4 (A. F. Williams, personal
communication). This imparts full gp120 binding activity on the
normally unreactive rat CD4 molecule. It also shows that
33 additional side chains which differ in rat and human
CD4, including Gln 64, are not absolutely essential for the
binding (ref. 4). But Glu 77, Thr 81 and Glu 85 are among the residues
conserved between rodents and primates. Thus, whereas the
41-59 region is positively identified as being involved in
gp120 binding, participation of the 77-85 region cannot be
excluded.

Antibody binding studies also help to define the gp120 binding
site (refs 10, 40, and our unpublished results), pointing especially
to the CDR2-like region. Two major groups of D1-binding
monoclonal antibodies have been described and their epitopes
are illustrated in Fig. 5c. The epitope of one cross-blocking
group, typified by Leu3a, maps to the CDR2-like loop and
usually includes portions of the CDR1-like loop. These antibodies
also block gp120 binding. The epitope of the second
group, typified by monoclonal L71, includes the CDR3-like loop
and sometimes a portion of the strand G. These antibodies
inhibit viral infection and cell fusion, but do not efficiently block
gp120 binding; some even form ternary complexes with gp120
and CD4 (A. Truneh et al., manuscript submitted).

The CDR2-like region that is strongly implicated in gp120
binding is also very prominent in the CD4 structure (Figs 3 and
5). The C'-C'' hairpin loop at Gln 40-Phe 43 is almost completely
exposed. Most strikingly, the hydrophobic side chain of Phe 43
juts out into the solvent as shown in Fig. 5e, f. The involvement
of a phenyl group in binding is also suggested by peptide
inhibitor studies (refs 41, 42). If the 77-85 span were also to be directly
involved in gp120 binding, this would present a puzzle because
this site is on a face nearly opposite from the CDR2-like region
(the vector from the molecular centre of D1 through Cα43
makes an angle of 152° with the similar vector through Cα81),
and no intervening residues have been implicated in gp120
recognition.
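
The angular separation quoted above is the angle between two vectors drawn from a common centre through two Cα positions. A minimal sketch of that geometry follows; it assumes the centre is supplied directly (for example as a centroid of domain atoms), and the coordinates used are hypothetical rather than taken from the model.

```python
import math


def angle_between(center, p, q):
    """Angle in degrees between the vectors center->p and center->q."""
    u = [a - c for a, c in zip(p, center)]
    v = [a - c for a, c in zip(q, center)]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates (angstroms), for illustration only.
domain_centre = (0.0, 0.0, 0.0)
ca_first = (1.0, 0.0, 0.0)
ca_second = (-1.0, 1.0, 0.0)

angle = angle_between(domain_centre, ca_first, ca_second)
print(f"{angle:.1f} degrees")
```

An angle approaching 180° means the two sites lie on nearly opposite faces of the domain, which is the point made in the text.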




FIG. 5 Images of CD4 relating to sites of interaction with HIV. a, Cα backbone
of D1 with sites implicated by mutational and structural analysis to affect
high-affinity gp120 binding (see Table 2). Non-buried residues that show
reduction in binding without global disruption of structure (as judged by
antibody binding) are coloured red-orange (29, 41, 42, 43, 44, 45, 47, 49,
52, 58, 59, 77, 81 and 85) on a backbone drawn by WORM (L. Andrews).
The orientation is as in Figs 2 and 3. b, Van der Waals' surface of D1 with
side chains of gp120-sensitive residues coloured. Those showing marked
reductions are in pink (43, 44 and 85 are visible) and those showing moderate
effects are in purple (59 and 42 are visible). The point of view is from above
a, looking down at the tips of CDR-like loops. c, Van der Waals' surface of
D1 with side chains of antibody epitopes coloured. Residues identified with
epitopes of the Leu3a family are shown in purple (24, 25, 27, 42 and 43 visible)
and those identified with the L71 family are shown in yellow (88 and 89 visible).
The view is roughly as in b. d, Electrostatic potential surface computed with
DELPHI (ref. 33) at neutral pH, and displayed with AAK (A. Nicholls and B. Honig)
at the levels of the solvent-accessible surface with a probe radius of 1.4 Å.
Blue represents positive potential, red negative, and white neutral. The
negative patch is associated with residues 85, 87 and 88. The view is
approximately as in b. e, An all-atom representation of D1D2. Residues in
the gp120-binding region from residue 41 to 59 are drawn in red. The
direction of view is roughly as in Fig. 2, but the molecule has been rotated
by ~50° about this view axis. This view illustrates the exposed nature of
Phe 43. The actual conformation of this phenyl side chain in the crystal is
at least partly determined by lattice interactions, but its highly exposed
nature would also be expected to persist in other conformations accessible
in solution. f, Stereoview of the molecular surface in the major gp120-binding
region of D1. Atoms are drawn as in e and are enveloped by the surface
in contact with a probe sphere of 1.4 Å radius as displayed by QUANTA
(Polygen). The view is taken after rotation from e by -90° about the
horizontal axis.

Fusion determinants 

After the initial binding event, entry of virus into the cell occurs 
through fusion of cellular and viral envelope membranes. In 
addition, HIV-infected cells fuse with uninfected CD4+ cells to 
form syncytia, a process mediated by interactions between HIV 
envelope protein expressed on the infected cell surface and CD4 
on the uninfected cell. Syncytium formation is thought to mimic 
viral entry and could be an important mode of viral spreading. 
Chimpanzee CD4, which binds to HIV but does not support 
syncytium formation, has Gly replacing Glu at residue 87. This 
replacement in human CD4 abolishes syncytium formation while 
preserving gp120 binding 17 and, curiously, normal viral infection 
kinetics. Residues 88 and 89 are also implicated in this process 
by mutation (Table 2). These residues lie at the β-turn tip of 
the CDR3-like loop in D1 and are part of a patch of negative 
potential on the otherwise mostly positive D1D2 surface (Fig. 
5d). The CDR3-like loop is spatially separated from the CDR2-like 
loop implicated in high-affinity binding to gp120 (Cα87 is 
17 Å from Cα43). In accord, monoclonal antibodies of the L71 
family block syncytium formation although they permit gp120 
binding to CD4. So, it seems that the CDR3-like loop is necessary 
for secondary interactions between the viral envelope and CD4 
before fusion. This could relate to CD4-induced release of gp120 
from virus and infected cells 4 * 4i and exposure of a fusogenic 
domain on the membrane envelope protein gp41. 

Immune response interactions 

The observation that CD4+ lymphocytes react only with class 
II-bearing target cells suggests that CD4 associates directly with 
class II MHC, and it is thought that CD4 may also have a role 
in signal transduction. There is evidence from cellular assays to 
support the association of CD4 with class II MHC molecules, 
the T-cell receptor 47, and the p56lck kinase 48, as well as other 
CD4 molecules 49. Except for the interaction of CD4 with p56lck, 
however, these associations are necessarily weak and thus 
difficult to measure. Mutagenesis in vitro has been used to 
identify the sites on CD4 that interact with class II MHC 
molecules. On the basis of assays of either cell adhesion 18,50 or 
interleukin-2 production 18, large separated expanses of the CD4 
molecule, involving regions on D1, D2 and D3, have been 
implicated in the CD4-MHC interaction. These findings are not 
easy to reconcile with the structure of CD4. Some of the disruptive 
mutations occur in contact or bridging regions between the 
domains. Perhaps some of the other disruptions reflect complications 
from the diverse components of the cellular system rather 
than direct binding. For example, self-associations of CD4 
might be involved in signal transduction and possibly in class 
II binding. 

Evolutionary implications 

There is little doubt that CD4 and immunoglobulins have evolved 
from a common ancestor. Intron and domain boundaries 
coincide and, although the sequences are highly divergent, 
superimposable strand topologies and a common hydrophobic 
core are preserved. It seems likely, however, that evolution to 
the antibody family, with its vast repertoire of dimeric receptor 
units, was a later event, and that CD4 is more prototypical 
of immunoglobulin superfamily receptors which are often 
monomeric and nonpolymorphic. The absence from CD4 of 
J-like regions, which impart diversity in antibodies, is consistent 
with this. The progenitor of the diverse immunoglobulin superfamily 
of the present may have its vestige in the gene structure 
of D1, which is split by an intron (at position 47) as are genes 
for other immunoglobulin-like domains 51,52. A quasi-diad axis, 
perpendicular to the sheets and passing midway between strands 
B and E on one sheet and between strands C and F on the 
other, can be used to superimpose successive strands in the first 
exon on those in the second exon. An immunoglobulin progenitor 
produced by a gene duplication event" might have been 
V-like, but a D2-like progenitor that would evolve to V and C 
domains is also a possibility. 

Whatever the course of evolution, it is remarkable that 
molecules that function in recognition are often members of the 
immunoglobulin gene family. Clearly the immunoglobulin fold 
provides a facile framework from which a variety of loops with 
distinctive characteristics can be elaborated. These loops can 
have distinct functions, as in binding to gp120 at the CDR2-like 
loop and affecting fusion from the CDR3-like loop, and such 
modularity might have evolutionary advantage. Similarly, the 
quasi-symmetric nature of the immunoglobulin motif, with N 
and C termini at opposite ends, facilitates the concatenation of 
tandem, flexibly linked modules that can have distinct roles. In 
the case of CD4, one domain might be involved in class II 
binding whereas another domain might effect self-associations 
of the kind found in crystalline polymorphs 1 *, and these could 
be essential for signal transduction. Indeed, in an alignment of 
D3 with D1, dimerization loops of immunoglobulin V domains 
would not be foreshortened in D3 as they are in D1. Such 
modules could evolve separately and be recombined by 
exon shuffling to integrate complex recognition and effector 
functions. 

Entropy production as the 
selection rule between 
different growth morphologies 

Adrian Hill 

Physiological Laboratory, Downing Street, Cambridge CB2 3EG, UK 

CRYSTALLIZATION of a solid phase from a melt or solution is 
a special case of pattern formation in which the dissipation of 
energy across the free energy gradient between the two phases can 
give rise to various growth morphologies in the steady state 1. 
Experimental studies of crystallization from undercooled solutions 2, 
electrolytic deposition 3,4 and the formation of fluid patterns 
in a Hele-Shaw cell 5 have revealed faceted, dendritic, 
'dense-branching' and fractal morphologies 6. For a system with fixed 
anisotropy and interfacial tension, changes in the driving force for 
the transition (such as the degree of undercooling) can cause 
changes in growth morphology which are usually accompanied by 
changes in growth rate. The selection rule that determines these 
morphologies remains unclear, although a recent suggestion 5,6 is 
that it is based on the growth velocity. Here I propose that selection 
is governed by the rate of entropy production per unit area of the 
different growth patterns. This principle allows accurate prediction 
of the morphology transition observed for the crystallization of 
NH4Cl (ref. 2). I suggest that it may reflect a more general 
thermodynamic principle underlying a wide range of natural processes. 

Consider the formation of a generalized 'crystal' in which the 
driving force X (ignoring surface terms) is proportional to either 
supersaturation, pressure or other fields. The form of the crystal 
involves many surface orientations which have different surface 
free energies. Denoting this surface free energy f by 



$$f = u - Ts \tag{1}$$



where u and s are the surface energy and entropy of the crystal, 
the velocity V representing the rate of crystallization is expressed 
as a linear function 

$$V = L(X - \theta) \tag{2}$$

where X is the difference between the free energies of the 
dissolved solute and the bulk solid and θ is the free energy 
required to increase the area, so that X − θ is the total driving 
force, and L is the rate coefficient reflecting the growth rate of 
a particular morphology and its rate of entropy production. The 
dissipation is given by 

$$\phi = \frac{dS}{dt} = V(X - \theta) = L(X - \theta)^2 \tag{3}$$

where θ is a linear function of f and S is the entropy created 
in the crystallization. A model of the process of growth that 
relates velocity to driving force is a model of the rate coefficient 
and a link between the kinetics and the entropy production. 
Ben-Jacob and Garik 6 suggest that there must be a conjugate 
variable to the average rate of crystal growth. This conjugate 
variable is the force (X − θ). Equations (2) and (3) assume 
linear phenomenological relations but these are merely convenient 
and not essential: they almost certainly do not hold far 
from equilibrium or when chemical reactions occur. 
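
The linear relations can be made concrete with a minimal sketch. The function names and the numerical values below are illustrative assumptions, not taken from the text; the code only evaluates the velocity V = L(X − θ) and the dissipation φ = V(X − θ) for a single morphology.

```python
def velocity(L, X, theta):
    """Growth velocity in the linear regime: V = L * (X - theta)."""
    return L * (X - theta)


def dissipation(L, X, theta):
    """Rate of entropy production: phi = V * (X - theta) = L * (X - theta)**2."""
    return velocity(L, X, theta) * (X - theta)


# Hypothetical rate coefficient and surface term, chosen only for illustration.
L_coeff, theta = 2.0, 1.0
for X in (0.0, 1.0, 2.0, 3.0):
    print(X, velocity(L_coeff, X, theta), dissipation(L_coeff, X, theta))
```

The dissipation is quadratic in the total driving force, so it is non-negative on both sides of X = θ, whereas the velocity changes sign there.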

The dissipation functions of two morphologies formed by 
adjacent steady states will in general have crossover points (Fig. 
1). It follows directly from equation (3) that the crossover point 
for adjacent morphologies occurs at the driving force 



$$X = \frac{\sqrt{L_2}\,\theta_2 - \sqrt{L_1}\,\theta_1}{\sqrt{L_2} - \sqrt{L_1}} \tag{4}$$
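
The crossover condition can be worked out in a few lines from equation (3). The sketch below assumes that both rate coefficients are positive and that both morphologies are on their growth branches (X greater than both θ₁ and θ₂); the subscripts 1 and 2 simply label the two adjacent morphologies.

```latex
\begin{aligned}
L_1 (X - \theta_1)^2 &= L_2 (X - \theta_2)^2 \\
\sqrt{L_1}\,(X - \theta_1) &= \sqrt{L_2}\,(X - \theta_2)
  \qquad (X > \theta_1,\ X > \theta_2) \\
X\,(\sqrt{L_2} - \sqrt{L_1}) &= \sqrt{L_2}\,\theta_2 - \sqrt{L_1}\,\theta_1 \\
X &= \frac{\sqrt{L_2}\,\theta_2 - \sqrt{L_1}\,\theta_1}{\sqrt{L_2} - \sqrt{L_1}}
\end{aligned}
```

Taking the opposite sign for one square root gives the second intersection of the two parabolas, where a growth branch of one curve meets the disaggregation branch of the other.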



There are entropic crossovers because equation (3) is a parabola 
in the φ-X plane: the section between X = 0 and X = θ represents 
a disaggregation of the crystal structure. This does not 
normally occur in experiments unless crystals are initially present, 
in which case one morphology disappears as the other is 
formed. As φ is always positive, I consider for the moment only 
the parts of the entropy curves with positive gradients because 
these represent crystallization, that is, the forward reaction 
driven by X. Reference to Fig. 1 shows that if a crossover point 
is to occur in this domain when θ₁ < θ₂ then L₁ < L₂. At higher 
driving forces new crossovers occur if there are steady states 
that can exist by creating other morphologies with higher values 
of L. When the driving force is reduced from a high value, the 
principle of selection for the steady state with highest value of 
φ predicts that at the crossover points (c₁ and c₂ in Fig. 1) 
transitions occur to different steady states producing different 
crystal morphologies. In the linear regime these transitions cannot 
occur without discontinuities in the rate of crystallization 
because, as follows from equations (2) and (4), the only point 
free from such discontinuity is the trivial one 



$$X = \theta_1 = \theta_2 \tag{5}$$



in which the rates are zero. Small differences in entropy produc- 
tion rates at a crossover, however, could lead to small changes 
in slope together with small discontinuities which may be 
masked by variation in the data. Discontinuities are indeed a 
striking feature of morphology changes in various experimental 




FIG. 1 The crossing points c₁ and c₂ of the entropic curves e^e^ (solid 
lines) of three morphologies. The associated velocities (dashed lines) 
are also shown. The constants for the three entropic curves are L₁ = 0.5, 
θ₁ = −10; L₂ = 4, θ₂ = 1; L₃ = 87, θ₃ = 12. 
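
The crossing points in Fig. 1 can be reproduced numerically. The following is a sketch under stated assumptions: the constants are read from the caption (L₁ = 0.5, θ₁ = −10; L₂ = 4, θ₂ = 1; θ₃ = 12), the value L₃ = 87 is an uncertain reading of a partly illegible number, and the function names are hypothetical. Selection is taken to mean choosing the growing morphology (X > θ) with the largest entropy production.

```python
import math

# (L, theta) pairs as read from the Fig. 1 caption; L3 = 87 is an assumed reading.
MORPHOLOGIES = [(0.5, -10.0), (4.0, 1.0), (87.0, 12.0)]


def velocity(L, X, theta):
    """Growth velocity, V = L * (X - theta)."""
    return L * (X - theta)


def dissipation(L, X, theta):
    """Entropy production rate, phi = L * (X - theta)**2."""
    return L * (X - theta) ** 2


def crossover(L1, theta1, L2, theta2):
    """Driving force at which two growth branches have equal dissipation."""
    r1, r2 = math.sqrt(L1), math.sqrt(L2)
    return (r2 * theta2 - r1 * theta1) / (r2 - r1)


def selected_morphology(X, morphologies=MORPHOLOGIES):
    """Index of the growing morphology (X > theta) with the largest phi, or None."""
    growing = [(dissipation(L, X, th), i)
               for i, (L, th) in enumerate(morphologies) if X > th]
    return max(growing)[1] if growing else None


c1 = crossover(*MORPHOLOGIES[0], *MORPHOLOGIES[1])
c2 = crossover(*MORPHOLOGIES[1], *MORPHOLOGIES[2])
for label, c, (lo, hi) in (("c1", c1, (0, 1)), ("c2", c2, (1, 2))):
    v_lo = velocity(*[MORPHOLOGIES[lo][0], c, MORPHOLOGIES[lo][1]])
    v_hi = velocity(*[MORPHOLOGIES[hi][0], c, MORPHOLOGIES[hi][1]])
    # Equal dissipation at the crossover, but the velocities differ.
    print(label, round(c, 3), round(v_lo, 3), round(v_hi, 3))
```

At each crossover the two dissipation values agree while the velocities differ by the factor √(L₂/L₁) (or √(L₃/L₂)), which is the discontinuity in growth rate discussed in the text.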